Fix: Gemma3TextConfig rope scaling assignments #41934
Rocketknight1 merged 2 commits into huggingface:main
Conversation
[For maintainers] Suggested jobs to run (before merge): run-slow: gemma3
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Rocketknight1 left a comment
I'm guessing the context for this is that rope_scaling clobbering rope_parameters (the old behaviour) gave the right config for older models, but breaks on upcoming models that might have specific configs for, e.g., sliding_attention?
Yes, we've observed that.
Got it, thanks for the fix!
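To illustrate the clobbering discussed in this exchange, here is a minimal hypothetical sketch. The key names and dict layout are assumptions for illustration only, not the actual transformers internals or the merged diff:

```python
# Hypothetical sketch of the old vs. fixed behaviour; the key names and
# dict layout are illustrative assumptions, not the actual transformers code.

legacy = {"rope_type": "linear", "factor": 8.0}  # old-style rope_scaling dict
per_layer = {
    "full_attention": {"rope_type": "default", "rope_theta": 1_000_000.0},
    "sliding_attention": {"rope_type": "default", "rope_theta": 10_000.0},
}

# Old behaviour: a legacy rope_scaling dict replaces rope_parameters
# wholesale, silently dropping the sliding_attention entry above.
rope_parameters_old = legacy if legacy is not None else per_layer

# Fixed behaviour: merge the legacy scaling into the layer type it
# applies to, keeping the per-layer-type structure intact.
rope_parameters_fixed = dict(per_layer)
rope_parameters_fixed["full_attention"] = {**per_layer["full_attention"], **legacy}
```

The point is simply that a blanket `rope_parameters = rope_scaling` assignment loses per-layer-type entries, whereas a targeted merge preserves them.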
* Fix: Gemma3TextConfig rope scaling assignments
* Fix: type annotation for rope_parameters
What does this PR do?
Related to #41922, this PR corrects the assignment of the rope_scaling dictionary present on some Gemma 3 pre-trained models on the HF Hub when normalizing to the new rope_parameters value.
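As a concrete but hypothetical example of that normalization, the sketch below converts a legacy-style config dict into per-layer-type parameters. The helper name, the config keys shown, and the assumption that rope_scaling applies only to the full_attention (global) layers are illustrative assumptions, not a statement about the merged code:

```python
# Hypothetical normalization of a legacy config; helper name and key
# layout are assumptions for illustration only.

legacy_config = {
    "rope_theta": 1_000_000.0,          # base frequency for global layers
    "rope_local_base_freq": 10_000.0,   # base frequency for local layers
    "rope_scaling": {"rope_type": "linear", "factor": 8.0},
}

def to_rope_parameters(cfg: dict) -> dict:
    """Build per-layer-type RoPE parameters from a legacy config dict."""
    params = {
        "full_attention": {"rope_type": "default", "rope_theta": cfg["rope_theta"]},
        "sliding_attention": {"rope_type": "default", "rope_theta": cfg["rope_local_base_freq"]},
    }
    if cfg.get("rope_scaling") is not None:
        # Apply the legacy scaling only where it belongs instead of
        # overwriting the whole per-layer-type structure.
        params["full_attention"].update(cfg["rope_scaling"])
    return params

print(to_rope_parameters(legacy_config))
```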
Before submitting
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@zucchini-nlp PTAL since you have been handling the RoPE changes.